-
Notifications
You must be signed in to change notification settings - Fork 145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compute memory usage per location #538
Conversation
b7e1fe2
to
29016bc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's live review
#include <mlir/IR/Operation.h> | ||
#include <mlir/Pass/Pass.h> | ||
|
||
#include <concretelang/Support/CompilationFeedback.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So here and same reflexion for statistics, why not follow the common pass pattern?
|
||
namespace mlir { | ||
namespace concretelang { | ||
namespace Concrete { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This enter/exit pass could be factorized
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or even more the nested loop logic too
namespace concretelang { | ||
namespace Concrete { | ||
|
||
struct MemoryUsagePass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following the common pass pattern, this should be hidden in cpp and expose uniquely a createAnalysisConcreteMemoryUsage
function.
|
||
namespace { | ||
|
||
template <typename Op> std::string locationOf(Op op) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so all this things could be factorized
pass.iterations /= (uint64_t)numberOfIterations; | ||
return std::nullopt; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
until here
return StringError( | ||
"allocation of buffer with a non-supported element-type"); | ||
for (auto size : shape) { | ||
if (size == mlir::ShapedType::kDynamic) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that actually happens?
continue; | ||
// allocation inside the current location are computed separately | ||
auto definingOp = operand.getDefiningOp(); | ||
if (definingOp && mlir::isa<memref::AllocaOp>(definingOp) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess typo here, AllocOp
not AllocaOp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch!
return maybeBufferSize.error(); | ||
} | ||
auto bufferSize = maybeBufferSize.value(); | ||
auto numberOfUsers = numberOfUsersInLocation(operand, op->getLoc()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its a bit hacky here and I would say it's not working (more than the rounding that could over estimate a bit the memory usage).
Let's consider this example :
%alloc_a = memref.alloc() : memref<...>
...
%alloc_b = memref.alloc() : memref<...>
scf.for ... {
...
scf.for {
%sub_a = memref.subview %alloc_a
%sub_b = memref.subview %alloc_b
op(%sub_a, %sub_b)
}
}
Here the b
location has as memory usage = %alloc_a + %alloc_b + %sub_a + %sub_b
while its actually memory footprint should be just %alloc_a + %alloc_b. I think for each location you need to have a set of the actual buffers used, and for each operand of operations you should find back the actual origin of the buffer (i.e. a block argument or the allocation) and update the set.
Typically for the location b here you will have b => %alloc_a, %alloc_b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I don't think we should consider subviews if we already considered the original buffer, as this shouldn't access a memory region that wasn't accessed before
- The division is trying to avoid adding a memory buffer multiple times, just because it has been used by different operations, and in this case, subview is an operation that should be ignored, and thus, they aren't added by the ops themselves, but the
op
op. However, keeping the relationship between subviews and views isn't straightforward.
return error; | ||
} | ||
} | ||
// ignore ops that have no memory effect |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure that is necessary to skip memory effect free ops... ??
f3c36c5
to
e168369
Compare
} | ||
} | ||
// we already count allocations separately | ||
if (definingOp && mlir::isa<memref::AllocOp>(definingOp) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this code is not necessary and we should not have a specific handle of the AllocOp, that will make the code simplest, wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alloc is a simpler case than the generic function which needs to search for the defining op. And as we can add other simple operations in the future, it would be easier to add their respective functions (just like alloc), and ignore them here, instead of extending the logic here (adding two or three more operations here might start making it much more complex)
compilers/concrete-compiler/compiler/lib/Dialect/Concrete/Analysis/MemoryUsage.cpp
Show resolved
Hide resolved
compilers/concrete-compiler/compiler/lib/Dialect/Concrete/Analysis/MemoryUsage.cpp
Outdated
Show resolved
Hide resolved
compilers/concrete-compiler/compiler/lib/Support/CompilationFeedback.cpp
Outdated
Show resolved
Hide resolved
8dd6f9c
to
40cc142
Compare
…circuit parametrization Since conversions at the edge of partitions determined by the optimizer are specified at the producer of a value, `MaterializePartitionBoundaryPattern` may agerly replace references to the original value before conversion, resulting in conflicts. This commit adds a check to the conflict handling routine that checks if the producer of an operation with a conflicting operand is a keyswitch operation that converts the original value of an extra conversion with the required, non-conflicting type. If this is the case, the conflict is handled by simply forwarding the original value from the operand of the producing keyswitch operation. Resolves Issue #538 (zama-ai/concrete-internal#538).
…circuit parametrization Since conversions at the edge of partitions determined by the optimizer are specified at the producer of a value, `MaterializePartitionBoundaryPattern` may agerly replace references to the original value before conversion, resulting in conflicts. This commit adds a check to the conflict handling routine that checks if the producer of an operation with a conflicting operand is a keyswitch operation that converts the original value of an extra conversion with the required, non-conflicting type. If this is the case, the conflict is handled by simply forwarding the original value from the operand of the producing keyswitch operation. Resolves Issue #538 (zama-ai/concrete-internal#538).
…circuit parametrization Since conversions at the edge of partitions determined by the optimizer are specified at the producer of a value, `MaterializePartitionBoundaryPattern` may agerly replace references to the original value before conversion, resulting in conflicts. This commit adds a check to the conflict handling routine that checks if the producer of an operation with a conflicting operand is a keyswitch operation that converts the original value of an extra conversion with the required, non-conflicting type. If this is the case, the conflict is handled by simply forwarding the original value from the operand of the producing keyswitch operation. Resolves Issue #538 (zama-ai/concrete-internal#538).
…circuit parametrization Since conversions at the edge of partitions determined by the optimizer are specified at the producer of a value, `MaterializePartitionBoundaryPattern` may eagerly replace references to the original value before conversion, resulting in conflicts. This commit adds a check to the conflict handling routine that checks if the producer of an operation with a conflicting operand is a keyswitch operation that converts the original value of an extra conversion with the required, non-conflicting type. If this is the case, the conflict is handled by simply forwarding the original value from the operand of the producing keyswitch operation. Resolves Issue #538 (zama-ai/concrete-internal#538).
…circuit parametrization Since conversions at the edge of partitions determined by the optimizer are specified at the producer of a value, `MaterializePartitionBoundaryPattern` may eagerly replace references to the original value before conversion, resulting in conflicts. This commit adds a check to the conflict handling routine that checks if the producer of an operation with a conflicting operand is a keyswitch operation that converts the original value of an extra conversion with the required, non-conflicting type. If this is the case, the conflict is handled by simply forwarding the original value from the operand of the producing keyswitch operation. Resolves Issue #538 (zama-ai/concrete-internal#538).
No description provided.